<!-- To compile slides with relative paths, use -->
<!-- xaringan::inf_mr(cast_from = '../..') -->
<!-- For self contained, just knit with markdown -->

<style type="text/css">
/* Apply styles only to the first slide with the title-slide class */
.title-slide {
  display: flex;
  justify-content: center;
  align-items: left;
  flex-direction: column;
  height: 800px;
  text-align: left;
}
/* Style for the logo in the title slide */
.title-slide img {
  float: left;
  margin-right: 20px;
  margin-left: 0px;
  width: 100px;
}
/* Style the title */
.title-slide h1 {
  color: #15803d;
  font-size: 40px;
  font-weight: bold;
  margin-bottom: 20px;
  margin-left: 0px;
}
/* Style the subtitle */
.title-slide h3 {
  color: #6b7280;
  font-size: 30px;
  padding-bottom: 10px;
  margin-bottom: 0px;
  padding-left: 0px; /* Ensure padding is consistent */
  text-align: left; /* Align text to the left */
}
/* Style the author and date */
.title-slide p {
  color: #6b7280;
  font-size: 18px;
  padding-bottom: 10px;
  padding-left: 0px; /* Ensure padding is consistent */
  text-align: left; /* Align text to the left */
}
.inverse .remark-slide-number {
  display: none;
}
.remark-slide-number {
  font-size: 14px;
}
/* td is for cells, th for headers */
table.rmdtable {
  width: 90%;
  border-collapse: -moz-initial;
  border-spacing: 2px;
  font-size: 13px;
  border-bottom: 0px solid #797979;
}
table {
  font-size: 20px;
}
table.rmdtable td, th {
  font-size: 18px;
  padding: 1em 0.5em;
}
table.rmdtable th {
  color: white;
  font-size: 22px;
  font-weight: bold;
  background: -webkit-linear-gradient(top, #02934a 40%, #02934a 80%) no-repeat;
}
table.rmdtable tr > td:first-child, table th {
  font-weight: normal;
}
</style>

.title-slide[
<img src="data:image/png;base64,#../../img/logo.png" width="40%" style="display: block; margin: auto auto auto 0;" />

# Data Handling: Import, Cleaning and Visualisation
### Lecture 2: Programming with Data
Dr.
Aurélien Sallin<br>26/09/2024
]

---
class: center, middle, inverse, Large

# Recap

---

<img src="data:image/png;base64,#../../img/data_science_pipeline.png" width="85%" style="display: block; margin: auto;" />

---

## .green[**At the end of the course, you will be able to...**]

- **Understand the tools you need when working with data**
- **Work independently with data**
- **Ask the right questions of a dataset**
- **Communicate about data**

---
class: right, bottom
background-image: url("data:image/png;base64,#https://upload.wikimedia.org/wikipedia/commons/thumb/d/d6/Apprenticeship.jpg/1280px-Apprenticeship.jpg")
background-size: cover

<style type="text/css">
.slide {
  position: relative;
  width: 100%;
  height: 100%;
}
.legend-pic {
  position: absolute;
  bottom: 0;
  right: 0;
  padding: 10px; /* Optional: Add some padding */
  white-space: nowrap;
}
</style>

.slide[.legend-pic[.small[.white[A shoemaker and his apprentice c.1914, Emile Adan]]]]

---
class: center, middle, inverse, Large

# Basic Programming Concepts

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Values and variables</p>

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Vectors</p>

---

## Vectors

<img src="data:image/png;base64,#../../img/numvec.png" width="10%" style="display: block; margin: auto;" />

---

## Vectors

<img src="data:image/png;base64,#../../img/charvec.svg" width="10%" style="display: block; margin: auto;" />

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Matrices</p>

---

## Matrices

<img src="data:image/png;base64,#../../img/matrix.png" width="40%" style="display: block; margin: auto;" />

---

## Matrices are combinations of vectors

--

.pull-left[

``` r
*cbind(c(1,2,3), c(4,5,6), c(7,8,9))
rbind(c(1,4,7), c(2,5,8), c(3,6,9))
matrix(nrow=3, ncol = 3, 1:9)
```
]

.pull-right[

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```
]

--

.pull-left[

``` r
cbind(c(1,2,3), c(4,5,6), c(7,8,9))
*rbind(c(1,4,7), c(2,5,8), c(3,6,9))
matrix(nrow=3, ncol = 3, 1:9)
```
]

.pull-right[

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```
]

--

.pull-left[

``` r
cbind(c(1,2,3), c(4,5,6), c(7,8,9))
rbind(c(1,4,7), c(2,5,8), c(3,6,9))
*matrix(nrow=3, ncol = 3, 1:9)
```
]

.pull-right[

```
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
```
]

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Math operators</p>

---

## Math operators: basic arithmetic

.pull-left[

``` r
# basic arithmetic
2+2
sum_result <- 2+2
sum_result
sum_result - 2
4*5
20/5

# Modulo (remainder)
5 %% 3

# Integer division
5 %/% 3
```
]

.pull-right[

```
## [1] 4
```

```
## [1] 4
```

```
## [1] 2
```

```
## [1] 20
```

```
## [1] 4
```

```
## [1] 2
```

```
## [1] 1
```
]

---

## Math operators: other operators

``` r
# other common math operators and functions
4^2
sqrt(4^2)
log(2)
exp(10)
log(exp(10))
```

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Loops</p>

---

## for-loop

<img src="data:image/png;base64,#../../img/forloop_black.png" width="35%" style="display: block; margin: auto;" />

---

## for-loop

``` r
# number of iterations
n <- 100

# start loop
for (i in 1:n) {
  # BODY
}
```

---

## while-loop

<img src="data:image/png;base64,#../../img/while_loop_black_own.png" width="50%" style="display: block; margin: auto;" />

---

## while-loop

``` r
# initialize the variable used in the logical statement
x <- 1

# start loop
while (x == 1) {
  # BODY
}
```

---

<p style="text-align: center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Logical statements</p>

---

## Logical statements

``` r
2+2 == 4 # is equal to
3+3 == 7
4!=7 # is not equal to
6>3
6<7
6<=6
```

---

<p style="text-align:
center;font-size: 25pt;position: fixed;top: 50%;left: 50%; transform: translate(-50%, -50%);">Control statements</p>

---

## Control statements

``` r
condition <- TRUE

if (condition) {
  print("This is true!")
} else {
  print("This is false!")
}
```

```
## [1] "This is true!"
```

---
class: center, middle, inverse, Large

# Functions

---

## Functions

<p style="text-align: center;font-size: 25pt;">$$f:X \rightarrow Y$$</p>

---

## Functions

<p style="text-align: center;font-size: 25pt;">$$2\times X = Y$$</p>

---

## Functions in R

Functions in R are either **built-in** or **user-defined**.

Load built-in functions from an R package:

``` r
# install a package
install.packages("<PACKAGE NAME>")

# load a package
library(<PACKAGE NAME>)
```

---

## Functions in R

Functions have three elements:

1. .green[*formals()*], the list of arguments that control how you call the function
2. .green[*body()*], the code inside the function
3. .green[*environment()*], the data structure that determines how the function finds the values associated with the names (not the focus of this course)

---

## Functions in R

``` r
myfun <- function(x, y){
  # BODY
  z <- x + y

  # What the function returns
  return(z)
}
```

#### Formals

``` r
formals(myfun)
```

```
## $x
##
##
## $y
```

---

## Functions in R

``` r
myfun <- function(x, y){
  # BODY
  z <- x + y

  # What the function returns
  return(z)
}
```

#### Body

``` r
body(myfun)
```

```
## {
##     z <- x + y
##     return(z)
## }
```

---

## Functions in R

``` r
myfun <- function(x, y){
  # BODY
  z <- x + y

  # What the function returns
  return(z)
}
```

#### Environment

``` r
environment(myfun)
```

```
## <environment: R_GlobalEnv>
```

---

## Functions in R

Example of a function: a simple power function

``` r
powerFunction <- function(base, exponent){
  results <- base ^ exponent
  return(results)
}

powerFunction(exponent = 2, base = 3)
powerFunction(base = 2, exponent = 3)
powerFunction(2, 3)
powerFunction(c(2,4,3), 3)
```

---
class: center, middle, inverse, Large

# Step up your game:
Functionals

---

## Functionals

- A functional is a function that takes a function as an input and returns a vector or a list as output.
- Functionals are an alternative to for-loops.
- Functionals can be programmed using the `apply` family (`apply`, `lapply`, `tapply`), or the `purrr::map()` family.

---

## Functionals: representation with `map()`

<img src="data:image/png;base64,#../../img/functionals.png" width="50%" style="display: block; margin: auto;" />

---

## Example of a functional

``` r
# Load purrr (install it first if needed)
library(purrr)

# Set a user-defined function
triple <- function(x) x * 3

# With lapply
lapply(1:3, triple)
```

```
## [[1]]
## [1] 3
##
## [[2]]
## [1] 6
##
## [[3]]
## [1] 9
```

``` r
# Apply the function to each element of a vector. map_dbl returns a numeric vector
map_dbl(1:3, triple)
```

```
## [1] 3 6 9
```

---

## The "apply" functional

`apply` applies a function to columns or rows of a matrix. It loops over rows (`MARGIN = 1`) or columns (`MARGIN = 2`) of a matrix.

.pull-left[

``` r
# Matrix with 2 rows and 4 columns, filled column by column
mymatrix <- matrix(c(1,2,3,11,12,13,1,10),
                   nrow = 2, ncol = 4)
```
]

.pull-right[

```
##      [,1] [,2] [,3] [,4]
## [1,]    1    3   12    1
## [2,]    2   11   13   10
```
]

Apply a sum function on each column

.pull-left[

``` r
apply(mymatrix, MARGIN = 2, sum)
```
]

.pull-right[

```
## [1]  3 14 25 11
```
]

---
class: center, middle, inverse, Large

# Recap

---

## Loops and functionals

| <div style="width:290px">**for-loops**</div> | **functionals** |
|:-:|:-:|
| Get messy fast | Elegant and readable, allow for better syntax |
| Long code | Short code, based on functions |
| Standard base R | `apply` family: standard R. `map` family: `purrr` package|

---
class: center

# Tutorial 1: A Function to Compute the Mean

Starting point: we should be aware of how the mean is defined:
`$$\bar{x} = \frac{1}{n}\left (\sum_{i=1}^n{x_i}\right ) = \frac{x_1+x_2+\cdots +x_n}{n}$$`.
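
As a quick sanity check of the definition above, the formula can be evaluated by hand for a tiny vector and compared with R's built-in `mean()`. This is only an illustrative sketch: the vector `x` and the name `manual_mean` are made up for this example and are not part of the tutorial.

``` r
# Hypothetical example vector with n = 3 elements
x <- c(2, 4, 9)
n <- length(x)

# Right-hand side of the definition, written out term by term
manual_mean <- (x[1] + x[2] + x[3]) / n

# Compare with the built-in function
manual_mean == mean(x)
```

Writing the sum out term by term only works for a vector of known length; the tutorial asks for a function that handles a numeric vector of any length.
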
---

## Tutorial 1: A Function to Compute the Mean

``` r
#####################################
# Mean Function:
# Computes the mean, given a
# numeric vector.

meaN <- function(x){

}
```

---

## Tutorial 2: on slow and fast sloths

We can use loops to simulate natural processes over time. Write a program that calculates the populations of two kinds of sloths over time. At the beginning of year 1, there are **1000 slow sloths** and **1 fast sloth**. This one fast sloth is a new mutation that is genetically able to use roller blades. Not surprisingly, being fast gives it an advantage, as it can better escape from predators.

| A slow sloth |A fast sloth in its natural element |
|:-:|:-:|
| <img src="data:image/png;base64,#../../img/slowsloth.png" height="260"/> <br> | <img src="data:image/png;base64,#../../img/fastsloth.png" height="260"/> <br> |

---

## Tutorial 2: on slow and fast sloths

Each year, each sloth has one offspring. There are no further mutations, so slow sloths beget slow sloths, and fast sloths beget fast sloths. Also, each year 40% of all slow sloths die, while only 30% of the fast sloths do.

At the beginning of year one there are 1000 slow sloths. Another 1000 slow sloths are born. But 40% of these 2000 slow sloths die, leaving a total of 1200 at the end of year one. Meanwhile, in the same year, we begin with 1 fast sloth, 1 more is born, and 30% of these die, leaving 1.4.

|Beginning of Year |Slow Sloths |Fast Sloths |
|---|---|---|
|1 |1000 |1 |
|2 |1200 |1.4 |
|3 |1440 |1.96 |

**Enter the first year in which the fast sloths outnumber the slow sloths.**

---

## Tutorial 3: A loop function

Consider the following function skeleton:

``` r
appendsums <- function(lst){
  #Repeatedly append the sum of the current last three elements
  #of lst to lst.

}
```

Create a function that repeatedly appends the sum of the current last three elements of the vector `lst` to `lst`. Hint: use the `append` and `tail` functions. Your function should loop 25 times.
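
Before writing the loop, it can help to try the two functions from the hint on a small vector. The snippet below is a minimal sketch; the names `v`, `last_two` and `v_longer` are hypothetical and chosen only for this illustration.

``` r
# Hypothetical example vector
v <- c(10, 20, 30, 40)

# tail() returns the last n elements of a vector
last_two <- tail(v, 2)      # 30 40

# append() returns a new vector with the value added at the end;
# v itself is left unchanged
v_longer <- append(v, 99)   # 10 20 30 40 99
```

Because `append()` returns a new vector instead of changing its argument, the result has to be assigned to a name if it is to be kept.
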
To check if your function is correct, run:

``` r
sum_three <- c(0, 1, 2)
# R does not modify arguments in place, so store the returned vector
sum_three <- appendsums(sum_three)

# Solution for testing:
sum_three[10] == 125
```

---

# Q&A

<style>
slides > slide { overflow: scroll; }
slides > slide:not(.nobackground):after { content: ''; }
code { color: white; }
pre { color: white; }
</style>

<!-- ## References {.smaller} -->